System, method and apparatus for machine learning

ABSTRACT

Disclosed is an artificial intelligence or machine learning algorithm that may be applied to a plurality of machine learning devices connected in a 5G environment to implement the Internet of things. A machine learning method performed by a first learning device according to one embodiment of the present disclosure may include obtaining input data; determining, from among a plurality of clusters, a cluster to which the input data belongs, by using a first artificial neural network; transmitting a plurality of sample features associated with the determined cluster to a second learning device using a second artificial neural network; receiving a label for the plurality of sample features from the second learning device, in response to the transmission; and associating the received label with the determined cluster.

CROSS-REFERENCE TO RELATED APPLICATION

Pursuant to 35 U.S.C. § 119(a), this application claims the benefit of an earlier filing date of and the right of priority to Korean Application No. 10-2019-0081266, filed on Jul. 5, 2019, the contents of which are incorporated by reference herein in their entirety.

BACKGROUND

1. Technical Field

The present disclosure relates to machine learning, and more particularly, to performing joint machine learning through cooperation between a plurality of devices.

2. Description of Related Art

Artificial intelligence (AI) is an area of computer engineering science and information technology that studies methods to make computers mimic intelligent human behaviors such as reasoning, learning, self-improving, and the like. In recent years, there have been numerous attempts to introduce an element of AI into various fields of information technology to solve problems in the respective fields.

Machine learning is an area of artificial intelligence that includes the field of study that gives computers the capability to learn without being explicitly programmed. More specifically, machine learning is a technology that investigates and builds systems, and algorithms for such systems, which are capable of learning, making predictions, and enhancing their own performance based on experiential data. Machine learning algorithms, rather than only executing rigidly set static program commands, may be used to take an approach that builds models for deriving predictions and decisions from inputted data.

With the proliferation of the Internet of things (IoT), home appliances such as televisions, washing machines, refrigerators, vacuum cleaners, and air conditioners; personal terminals such as smartphones, wearable devices, and tablets; and various sensors are interconnected in various ways to provide new services. Each individual device is also equipped with a processor capable of machine learning.

Real-world input data to be subjected to machine learning is becoming more diverse in, for example, type and form. However, an individual device having low processing performance is limited in its ability to perform machine learning on its own to build a trained model. In addition, since the trained model built into the device is unable to process as much data as a trained model built into, for example, a server, its label coverage area may also be very limited. For example, while the types or forms of articles that may be stored in a refrigerator are very diverse, the range of articles that may be covered by the refrigerator performing machine learning on its own is greatly limited.

Korean Patent Application Publication No. 10-2018-0025093, entitled “A METHOD AND APPARATUS FOR MACHINE LEARNING BASED ON WEAKLY SUPERVISED LEARNING” (hereinafter referred to as “Related Art 1”), discloses a method capable of training a convolutional network on its own by only utilizing a small dataset in an environment in which a large dataset cannot be secured. However, it is difficult to apply the method disclosed in Related Art 1 to an individual device having low processing performance.

SUMMARY OF THE INVENTION

One embodiment of the present disclosure is directed to addressing the shortcoming in the related art that it is difficult to build, in an individual device with low processing performance, a trained model capable of covering various input data.

One embodiment of the present disclosure is further directed to providing a machine learning system, method, and apparatus that allow a plurality of devices capable of performing machine learning to cooperate with each other to extend a coverage area of a trained model.

One embodiment of the present disclosure is still further directed to providing a machine learning system, method, and apparatus that allow an individual device to cooperate with another device to perform machine learning with little overhead and without the burden of exposing personal information.

Embodiments of the present disclosure are not limited to what has been disclosed hereinabove. Other objectives and features of the present disclosure which are not mentioned can be understood from the following description and will be more clearly understood by the embodiments of the present disclosure. In addition, it is to be understood that the objectives and features of the present disclosure may be realized by various means as pointed out in the appended claims and combinations thereof.

A machine learning system, method, and apparatus according to one embodiment of the present disclosure is configured such that a first learning device and a second learning device may cooperate with each other to perform joint machine learning. Specifically, the first learning device may be configured to determine, from among a plurality of clusters, a cluster to which input data belongs, by using a first artificial neural network, and the second learning device may be configured to determine a label for the determined cluster by using a second artificial neural network.

A machine learning method according to one embodiment of the present disclosure may be performed by a first learning device, and may include obtaining input data; determining, from among a plurality of clusters, a cluster to which the input data belongs, by using a first artificial neural network; transmitting a plurality of sample features associated with the determined cluster to a second learning device using a second artificial neural network; and receiving a label for the plurality of sample features from the second learning device, in response to the transmission.

The machine learning method according to one embodiment of the present disclosure may further include associating the received label with the determined cluster.

The received label may be determined based on analyzing the plurality of sample features by using the second artificial neural network in the second learning device.

The machine learning method according to one embodiment of the present disclosure may further include extracting the plurality of sample features from a plurality of features classified into the cluster, wherein a variance value of the extracted sample features may exceed a predetermined threshold value.
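The extraction step above can be illustrated with a minimal sketch. The sketch assumes that the "variance value" is the sum of per-dimension variances of the chosen samples and that candidates are drawn at random until the threshold is exceeded; neither assumption is stated in the disclosure, and the names extract_sample_features, num_samples, variance_threshold, and max_tries are hypothetical.

```python
import numpy as np

def extract_sample_features(cluster_features, num_samples, variance_threshold,
                            max_tries=100, seed=0):
    """Pick num_samples feature vectors whose total variance exceeds the threshold.

    Assumption: "variance value" is read as the sum of per-dimension variances.
    Returns None if no qualifying subset is found within max_tries draws.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(cluster_features, dtype=float)
    for _ in range(max_tries):
        # Draw a candidate subset of distinct feature vectors from the cluster.
        idx = rng.choice(len(features), size=num_samples, replace=False)
        samples = features[idx]
        # Accept the subset only if its spread exceeds the predetermined threshold.
        if samples.var(axis=0).sum() > variance_threshold:
            return samples
    return None

features = np.array([[0, 0], [10, 10], [0, 10], [10, 0], [5, 5]])
samples = extract_sample_features(features, num_samples=3, variance_threshold=1.0)
```

Under this reading, a small set of sample features stands in for the whole cluster, so that only the samples, and not every feature in the cluster, would need to be transmitted.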

The machine learning method according to one embodiment of the present disclosure may further include obtaining a second input data; receiving a label for the second input data; and transmitting at least one feature of the second input data and the label for the second input data to the second learning device, wherein the at least one feature of the second input data and the label for the second input data may be used as training data for training the second artificial neural network in the second learning device.

The first learning device may include a terminal, and the second learning device may include a server. In addition, the second artificial neural network may include more hidden layers than the first artificial neural network.

A first learning device using a first artificial neural network according to one embodiment of the present disclosure may include an input unit configured to obtain input data; a communication unit configured to communicate with a second learning device using a second artificial neural network; and at least one processor, wherein the at least one processor may be configured to determine, from among a plurality of clusters, a cluster to which the input data belongs, by using the first artificial neural network; transmit a plurality of sample features associated with the determined cluster to the second learning device via the communication unit; receive a label for the plurality of sample features via the communication unit from the second learning device, in response to the transmission; and associate the received label with the determined cluster.

The received label may be determined based on analyzing the plurality of sample features by using the second artificial neural network in the second learning device.

The at least one processor may be configured to extract the plurality of sample features from a plurality of features classified into the cluster, wherein a variance value of the extracted sample features may exceed a predetermined threshold value.

The input unit may be configured to obtain a second input data and a label for the second input data, the at least one processor may be configured to transmit at least one feature of the second input data and the label for the second input data to the second learning device, and the at least one feature of the second input data and the label for the second input data may be used as training data for training the second artificial neural network in the second learning device.

The first learning device may include a terminal, and the second learning device may include a server. In addition, the second artificial neural network may include more hidden layers than the first artificial neural network.

A machine learning system according to one embodiment of the present disclosure may include a first learning device configured to obtain input data; and a second learning device communicatively connected to the first learning device, wherein the first learning device may be configured to determine, from among a plurality of clusters, a cluster to which the input data belongs, by using a first artificial neural network, and wherein the second learning device may be configured to determine a label for the determined cluster by using a second artificial neural network.

The first learning device may be configured to transmit a plurality of sample features associated with the determined cluster to the second learning device, the second learning device may be configured to determine a label for the plurality of sample features by using the second artificial neural network and transmit the determined label to the first learning device, and the label for the plurality of sample features may be associated with the determined cluster.
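The exchange between the two devices can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid lookup stands in for the first artificial neural network, a nearest-prototype majority vote stands in for the second artificial neural network, the network transport is replaced by a direct method call, and all class names, method names, and the labels "apple" and "milk" are hypothetical.

```python
from collections import Counter
import numpy as np

class SecondLearningDevice:
    """Stand-in for the server; a nearest-prototype lookup replaces the second ANN."""

    def __init__(self, prototypes):
        self.prototypes = {lbl: np.asarray(v, dtype=float) for lbl, v in prototypes.items()}

    def label_for(self, sample_features):
        # Label each sample feature, then return the majority label for the set.
        votes = Counter()
        for feature in np.asarray(sample_features, dtype=float):
            nearest = min(self.prototypes,
                          key=lambda lbl: np.linalg.norm(feature - self.prototypes[lbl]))
            votes[nearest] += 1
        return votes.most_common(1)[0][0]

class FirstLearningDevice:
    """Stand-in for the terminal; a nearest-centroid lookup replaces the first ANN."""

    def __init__(self, centroids, server):
        self.centroids = np.asarray(centroids, dtype=float)
        self.server = server
        self.cluster_features = {i: [] for i in range(len(self.centroids))}
        self.cluster_labels = {}

    def determine_cluster(self, feature):
        return int(np.argmin(np.linalg.norm(self.centroids - feature, axis=1)))

    def observe(self, feature):
        feature = np.asarray(feature, dtype=float)
        cluster = self.determine_cluster(feature)
        self.cluster_features[cluster].append(feature)
        if cluster not in self.cluster_labels:
            # Transmit sample features of the cluster and associate the returned label.
            samples = self.cluster_features[cluster][-3:]
            self.cluster_labels[cluster] = self.server.label_for(samples)
        return cluster, self.cluster_labels[cluster]

server = SecondLearningDevice({"apple": [1, 1], "milk": [9, 9]})
device = FirstLearningDevice([[0, 0], [10, 10]], server)
first_result = device.observe([1.2, 0.8])
second_result = device.observe([8.5, 9.5])
```

In this sketch the label is requested once per cluster and then reused for later inputs that fall into the same cluster, which is one way to read "associating the received label with the determined cluster".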

The first learning device may be configured to obtain a second input data and a label for the second input data, and transmit at least one feature of the second input data and the label for the second input data to the second learning device, and the second learning device may be configured to train the second artificial neural network by using the at least one feature of the second input data and the label for the second input data as training data.

The first learning device may include a terminal, and the second learning device may include a server. In addition, the second artificial neural network may include more hidden layers than the first artificial neural network.

The first learning device and the second learning device may be communicatively connected to each other over a 5G communication network.

A machine learning method according to another embodiment of the present disclosure may be performed by a second learning device, and may include receiving a plurality of sample features associated with a cluster to which input data belongs, from a first learning device using a first artificial neural network; determining a label for the plurality of sample features by a second artificial neural network; and transmitting the determined label to the first learning device, wherein the transmitted label may be associated with the cluster by the first learning device.

The machine learning method according to another embodiment of the present disclosure may further include receiving at least one feature of a second input data and a label for the second input data from the first learning device; and training the second artificial neural network by using the at least one feature of the second input data and the label for the second input data as training data.

A refrigerator according to one embodiment of the present disclosure may perform machine learning by using a first artificial neural network, and may include a storage chamber having a plurality of storage spaces; an input unit configured to obtain input data relating to an article to be stored in the storage chamber; a communication unit configured to communicate with a server using a second artificial neural network; and at least one processor, wherein the at least one processor may be configured to determine, from among a plurality of clusters, a cluster to which the article belongs, based on analyzing at least one feature of the input data by using the first artificial neural network; transmit a plurality of sample features associated with the determined cluster to the server via the communication unit; receive a label for the plurality of sample features via the communication unit from the server, in response to the transmission; and determine, from among the plurality of storage spaces, a recommended storage space for storing the article, based on the received label.

The received label may be determined based on analyzing the plurality of sample features by using the second artificial neural network in the server.

The refrigerator according to one embodiment of the present disclosure may further include a plurality of illumination devices for illuminating the plurality of storage spaces, and the at least one processor may be configured to turn on at least one illumination device corresponding to the recommended storage space among the plurality of illumination devices.

The input unit may be configured to obtain a second input data relating to a second article and a label for the second input data, the at least one processor may be configured to transmit at least one feature of the second input data and the label for the second input data to the server, and the at least one feature of the second input data and the label for the second input data may be used as training data for training the second artificial neural network in the server.

The communication unit and the server may be configured to communicate with each other over a 5G communication network.

A refrigerator control method according to one embodiment of the present disclosure may include obtaining input data relating to an article to be stored in a storage chamber having a plurality of storage spaces; determining, from among a plurality of clusters, a cluster to which the article belongs, based on analyzing at least one feature of the input data by using a first artificial neural network; transmitting a plurality of sample features associated with the determined cluster to a server using a second artificial neural network; receiving a label for the plurality of sample features from the server, in response to the transmission; and determining, from among the plurality of storage spaces, a recommended storage space for storing the article, based on the received label.

The received label may be determined based on analyzing the plurality of sample features by using the second artificial neural network in the server.

The refrigerator control method according to one embodiment of the present disclosure may further include turning on at least one illumination device corresponding to the recommended storage space among a plurality of illumination devices for illuminating the plurality of storage spaces.
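The last two steps of the control method, choosing a recommended storage space from the received label and turning on the corresponding illumination device, might be sketched as below. The disclosure does not specify how a label maps to a storage space; the lookup table, the storage space names, the default space, and both function names are hypothetical and serve only as an illustration.

```python
# Hypothetical mapping from a received label to a storage space.
RECOMMENDED_SPACE = {
    "vegetable": "crisper drawer",
    "dairy": "upper shelf",
    "meat": "chiller compartment",
}
DEFAULT_SPACE = "middle shelf"  # Assumed fallback for labels without an entry.

def recommend_storage_space(label):
    """Return the recommended storage space for an article with the given label."""
    return RECOMMENDED_SPACE.get(label, DEFAULT_SPACE)

def illumination_states(storage_spaces, recommended):
    """Return an on/off state per storage space; only the recommended one is lit."""
    return {space: space == recommended for space in storage_spaces}

spaces = ["upper shelf", "middle shelf", "crisper drawer", "chiller compartment"]
recommended = recommend_storage_space("vegetable")
lights = illumination_states(spaces, recommended)
```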

The refrigerator control method according to one embodiment of the present disclosure may further include obtaining a second input data relating to a second article; receiving a label for the second input data; and transmitting at least one feature of the second input data and the label for the second input data to the server, wherein the at least one feature of the second input data and the label for the second input data may be used as training data for training the second artificial neural network in the server.

A computer program according to one embodiment of the present disclosure may be stored in a computer-readable storage medium, and may include program code for causing a learning device to perform the machine learning method described above.

According to one embodiment of the present disclosure, two or more devices capable of performing machine learning may cooperate with each other to extend a coverage area of a trained model. Therefore, the artificial intelligence or machine learning performance of an individual device can be improved in the Internet of things environment implemented over a 5G communication network. For example, a refrigerator and a server, which are capable of performing machine learning, may cooperate with each other to extend the coverage area of the trained model.

According to one embodiment of the present disclosure, the artificial intelligence or machine learning performance of an individual device having low processing performance can be improved. For example, since a refrigerator may determine a label for a new article with the help of a server, the artificial intelligence or machine learning performance of the refrigerator can be improved.

According to one embodiment of the present disclosure, an individual device may cooperate with another device to perform machine learning with little overhead and without the burden of exposing personal information. For example, a refrigerator may cooperate with a server to perform machine learning with little overhead and without the burden of exposing personal information of a user using an article.

The effects of the present disclosure are not limited to the above-mentioned effects, and other effects not mentioned may be clearly understood by those skilled in the art from the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other aspects and features of the disclosure, as well as the following detailed description of the embodiments, will be better understood when read in conjunction with the accompanying drawings. For the purpose of illustrating the present disclosure, there is shown in the drawings an exemplary embodiment. It is to be understood, however, that the present disclosure is not intended to be limited to the details shown because various modifications and structural changes may be made therein without departing from the spirit of the present disclosure and within the scope and range of equivalents of the claims. The use of the same reference numerals or symbols in different drawings indicates similar or identical items.

FIG. 1 is a block diagram illustrating a configuration of a machine learning device according to one embodiment of the present disclosure.

FIG. 2 is a view illustrating an operating environment of a machine learning system according to one embodiment of the present disclosure.

FIG. 3 is a block diagram illustrating a configuration of a machine learning system according to one embodiment of the present disclosure.

FIG. 4 is a view illustrating a coverage area of a trained model of each device of a machine learning system according to one embodiment of the present disclosure.

FIG. 5 is a view illustrating a joint machine learning operation of a machine learning system according to one embodiment of the present disclosure.

FIG. 6 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure.

FIG. 7 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure.

FIG. 8 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure.

FIG. 9 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure.

FIG. 10 is a flowchart illustrating a refrigerator control method according to one embodiment of the present disclosure.

DETAILED DESCRIPTION

The embodiments disclosed in the present specification will be described in greater detail with reference to the accompanying drawings, and throughout the accompanying drawings, the same reference numerals are used to designate the same or similar components and redundant descriptions thereof are omitted. In the following description, the suffixes “module” and “unit” that are mentioned with respect to the elements used in the present description are merely used individually or in combination for the purpose of simplifying the description of the present invention, and therefore, the suffix itself will not be used to differentiate the significance or function of the corresponding term. Further, in the description of the embodiments of the present disclosure, when it is determined that the detailed description of the related art would obscure the gist of the present disclosure, the description thereof will be omitted. Also, the accompanying drawings are provided only to facilitate understanding of the embodiments disclosed in the present disclosure and therefore should not be construed as being limiting in any way. It should be understood that all modifications, equivalents, and replacements which are not exemplified herein but are still within the spirit and scope of the present disclosure are to be construed as being included in the present disclosure.

The terms such as “first,” “second,” and other numerical terms may be used herein only to describe various elements and only to distinguish one element from another element, and as such, these elements should not be limited by these terms.

Similarly, it will be understood that when an element is referred to as being “connected”, “attached”, or “coupled” to another element, it can be directly connected, attached, or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being “directly connected”, “directly attached”, or “directly coupled” to another element, no intervening elements are present.

Artificial intelligence (AI) is an area of computer engineering science and information technology that studies methods to make computers mimic intelligent human behaviors such as reasoning, learning, self-improving, and the like.

In addition, AI does not exist on its own, but is rather directly or indirectly related to a number of other fields in computer science. Particularly in recent years, there have been numerous attempts to introduce an element of AI into various fields of information technology to address problems of the respective fields.

Machine learning is an area of artificial intelligence that includes the field of study that gives computers the capability to learn without being explicitly programmed. More specifically, machine learning is a technology that investigates and builds systems, and algorithms for such systems, which are capable of learning, making predictions, and enhancing their own performance based on experiential data. Machine learning algorithms, rather than only executing rigidly set static program commands, may be used to take an approach that builds models for deriving predictions and decisions from input data.

Numerous machine learning algorithms have been developed for data classification in machine learning. Representative examples of such machine learning algorithms for data classification include a decision tree, a Bayesian network, a support vector machine (SVM), an artificial neural network (ANN), and so forth.

The decision tree refers to an analysis method that uses a tree-like graph or model of decision rules to perform classification and prediction.

The Bayesian network may include a model that represents the probabilistic relationship (conditional independence) among a set of variables. A Bayesian network may be appropriate for data mining via unsupervised learning.

SVM may include a supervised learning model for pattern detection and data analysis, used heavily in classification and regression analysis.

ANN is a data processing system modelled after the mechanism of biological neurons and interneuron connections, in which a number of neurons, referred to as nodes or processing elements, are interconnected in layers.

ANNs are models used in machine learning and may include statistical learning algorithms conceived from biological neural networks (particularly of the brain in the central nervous system of an animal) in machine learning and cognitive science. ANNs may refer generally to models that have artificial neurons (nodes) forming a network through synaptic interconnections, and that acquire problem-solving capability as the strengths of synaptic interconnections are adjusted throughout training.

The terms ‘artificial neural network’ and ‘neural network’ may be used interchangeably herein.

An ANN may include a number of layers, each including a number of neurons. Furthermore, an ANN may include synapses that connect the neurons to one another.

An ANN may be defined by the following three factors: (1) a connection pattern between neurons on different layers; (2) a learning process that updates synaptic weights; and (3) an activation function generating an output value from a weighted sum of inputs received from a previous layer.

ANNs include, but are not limited to, network models such as a deep neural network (DNN), a recurrent neural network (RNN), a bidirectional recurrent deep neural network (BRDNN), a multilayer perceptron (MLP), and a convolutional neural network (CNN).

An ANN may be classified as a single-layer neural network or a multi-layer neural network, based on the number of layers therein.

In general, the single-layer neural network may include an input layer and an output layer.

In general, the multi-layer neural network may include an input layer, one or more hidden layers, and an output layer.

The input layer receives data from an external source, and the number of neurons in the input layer is identical to the number of input variables. The hidden layer is located between the input layer and the output layer, and receives signals from the input layer, extracts features, and feeds the extracted features to the output layer. The output layer receives a signal from the hidden layer and outputs an output value based on the received signal. Input signals between the neurons are summed together after being multiplied by corresponding connection strengths (synaptic weights), and if this sum exceeds a threshold value of a corresponding neuron, the neuron can be activated and output an output value obtained through an activation function.
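The behavior of a single neuron described above can be sketched as follows. The function name neuron_output, the choice of tanh as the activation function, and the convention that an inactive neuron outputs zero are assumptions made for illustration only.

```python
import numpy as np

def neuron_output(inputs, weights, threshold, activation=np.tanh):
    """Weighted sum of inputs; the neuron fires only if the sum exceeds its threshold."""
    # Multiply each input signal by its connection strength and sum the results.
    weighted_sum = float(np.dot(inputs, weights))
    if weighted_sum <= threshold:
        return 0.0  # Assumed output of a neuron that is not activated.
    # The activated neuron outputs the value obtained through the activation function.
    return float(activation(weighted_sum))

active = neuron_output([1.0, 2.0], [0.5, 0.25], threshold=0.5)   # weighted sum is 1.0
inactive = neuron_output([1.0, 2.0], [0.1, 0.1], threshold=0.5)  # weighted sum is about 0.3
```

In many practical networks the threshold is folded into a bias term that is added to the weighted sum before the activation function is applied.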

A deep neural network with a plurality of hidden layers between the input layer and the output layer may be the most representative type of artificial neural network which enables deep learning, which is one machine learning technique.

An ANN can be trained by using training data. The training may refer to the process of determining parameters of the artificial neural network by using the training data, to perform tasks such as classification, regression analysis, and clustering of inputted data. Such parameters of the artificial neural network may include synaptic weights and biases applied to neurons.

The artificial neural network trained by using training data can classify or cluster inputted data according to a pattern within the inputted data.

Throughout the present disclosure, the artificial neural network trained by using training data may be referred to as a trained model.

Hereinbelow, learning paradigms of the artificial neural network will be described in detail.

Learning paradigms, in which the artificial neural network operates, may be classified into supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

Supervised learning is a machine learning method that derives a single function from the training data. Among the functions that may be thus derived, a function that outputs a continuous range of values may be referred to as a regressor, and a function that predicts and outputs the class of an input vector may be referred to as a classifier.

In supervised learning, an artificial neural network can be trained with training data that has been given a label. Here, the label may refer to a target answer (or a result value) to be guessed by the artificial neural network when the training data is inputted to the artificial neural network. Throughout the present disclosure, the target answer (or a result value) to be guessed by the artificial neural network when the training data is inputted may be referred to as a label or labeling data. Throughout the present disclosure, assigning one or more labels to the training data in order to train the artificial neural network may be referred to as labeling the training data with labeling data.

Training data and a label corresponding to the training data together may form a single training set, and as such, they may be inputted to the artificial neural network as a training set.

The training data may represent a number of features, and the training data being labeled with the labels may be interpreted as the features exhibited by the training data being labeled with the labels. In this case, the training data may represent a feature of an input object as a vector.

Using training data and labeling data together, the artificial neural network may derive a correlation function between the training data and the labeling data. Then, through evaluation of the function derived from the artificial neural network, a parameter of the artificial neural network may be determined (optimized).
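How parameters may be determined from training data and labeling data can be illustrated with a deliberately small example. The sketch below fits a one-input linear model by gradient descent on a mean squared error; the linear model, the loss, the learning rate, and the function name fit_linear are assumptions chosen for brevity and are not taken from the disclosure.

```python
import numpy as np

def fit_linear(x, y, learning_rate=0.1, epochs=500):
    """Determine weight w and bias b of y = w * x + b by minimizing mean squared error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Evaluate the current function against the labels.
        error = (w * x + b) - y
        # Move each parameter against the gradient of the mean squared error.
        w -= learning_rate * 2.0 * np.mean(error * x)
        b -= learning_rate * 2.0 * np.mean(error)
    return w, b

# Training data labeled with the target function y = 2x + 1.
w, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

The same loop of evaluating the derived function against the labels and adjusting parameters applies to networks with many weights, where the gradients are typically obtained by backpropagation.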

Unsupervised learning is a machine learning method that learns from training data that has not been given a label. More specifically, unsupervised learning may be a training scheme that trains the artificial neural network to discover a pattern within given training data and perform classification by using the discovered pattern, rather than by using a correlation between given training data and labels corresponding to the given training data.

Examples of unsupervised learning include, but are not limited to, clustering and independent component analysis.

Examples of artificial neural networks using unsupervised learning include, but are not limited to, a generative adversarial network (GAN) and an autoencoder (AE).

GAN is a machine learning method in which two different artificial intelligences, a generator and a discriminator, improve performance through competing with each other. The generator may be a model that generates new data based on true data. The discriminator may be a model that recognizes patterns in data and determines whether inputted data is from the true data or from the new data generated by the generator.

Furthermore, the generator may receive and learn from data that has failed to fool the discriminator, while the discriminator may receive and learn from data that has succeeded in fooling the discriminator. Accordingly, the generator may evolve so as to fool the discriminator as effectively as possible, while the discriminator evolves so as to distinguish, as effectively as possible, between the true data and the data generated by the generator.

AE is a neural network which aims to reconstruct its input as output. More specifically, AE may include an input layer, at least one hidden layer, and an output layer.

Since the number of nodes in the hidden layer is smaller than the number of nodes in the input layer, the dimensionality of data is reduced, thus leading to data compression or encoding. Furthermore, the data outputted from the hidden layer may be inputted to the output layer. Given that the number of nodes in the output layer is greater than the number of nodes in the hidden layer, the dimensionality of the data increases, thus leading to data decompression or decoding.
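The change in dimensionality through an autoencoder can be seen in a forward pass alone. The sketch below uses untrained random weights, so the reconstruction is not meaningful; it only shows the shapes. The layer sizes (8 input nodes, 3 hidden nodes), the tanh activation, and the function name autoencoder_forward are assumptions for illustration.

```python
import numpy as np

def autoencoder_forward(x, encoder_weights, decoder_weights):
    """Encode inputs into a smaller hidden representation, then decode back."""
    # Encoding: fewer hidden nodes than input nodes reduces dimensionality.
    code = np.tanh(x @ encoder_weights)
    # Decoding: more output nodes than hidden nodes restores the original dimensionality.
    reconstruction = code @ decoder_weights
    return code, reconstruction

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                # 4 inputs with 8 features each
encoder_weights = rng.normal(size=(8, 3))  # 8 input nodes to 3 hidden nodes
decoder_weights = rng.normal(size=(3, 8))  # 3 hidden nodes to 8 output nodes
code, reconstruction = autoencoder_forward(x, encoder_weights, decoder_weights)
```

Training would adjust both weight matrices so that the reconstruction approaches the input, which forces the three hidden values to capture the dominant patterns in the data.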

Furthermore, in AE, the inputted data is represented as hidden layerdata as interneuron connection strengths are adjusted through training.The fact that when representing information, the hidden layer is able toreconstruct the inputted data as output by using fewer neurons than theinput layer may indicate that the hidden layer has discovered a hiddenpattern in the inputted data and is using the discovered hidden patternto represent the information.
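As a minimal sketch of the compression and decompression described above, the following Python example trains a linear autoencoder having two input nodes, one hidden node, and two output nodes by gradient descent. The toy data set, the initial connection strengths, the learning rate, and all function names are assumptions chosen for illustration and are not part of the present disclosure.

```python
# Hypothetical toy data: 2-D points lying on the line y = 2x, so a single
# hidden node is enough to represent them.
DATA = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0), (0.5, 1.0)]


def encode(w_enc, x):
    """Input layer (2 nodes) -> hidden layer (1 node): compression."""
    return w_enc[0] * x[0] + w_enc[1] * x[1]


def decode(w_dec, h):
    """Hidden layer (1 node) -> output layer (2 nodes): decompression."""
    return (w_dec[0] * h, w_dec[1] * h)


def reconstruction_loss(w_enc, w_dec, data):
    """Mean squared error between each input and its reconstruction."""
    total = 0.0
    for x in data:
        y = decode(w_dec, encode(w_enc, x))
        total += (y[0] - x[0]) ** 2 + (y[1] - x[1]) ** 2
    return total / len(data)


def train(data, steps=3000, lr=0.01):
    """Adjust the connection strengths by plain gradient descent."""
    w_enc = [0.5, 0.5]  # assumed initial values
    w_dec = [0.5, 0.5]
    for _ in range(steps):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in data:
            h = encode(w_enc, x)
            y = decode(w_dec, h)
            err = (y[0] - x[0], y[1] - x[1])
            # Gradient of the squared error with respect to the hidden value.
            dh = 2.0 * (err[0] * w_dec[0] + err[1] * w_dec[1])
            for i in range(2):
                g_dec[i] += 2.0 * err[i] * h / len(data)
                g_enc[i] += dh * x[i] / len(data)
        for i in range(2):
            w_enc[i] -= lr * g_enc[i]
            w_dec[i] -= lr * g_dec[i]
    return w_enc, w_dec


if __name__ == "__main__":
    w_enc, w_dec = train(DATA)
    print("loss after training:", reconstruction_loss(w_enc, w_dec, DATA))
```

In this sketch the single hidden value plays the role of the discovered hidden pattern: each two-dimensional input is reconstructed from one number.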

Semi-supervised learning is a machine learning method that makes use of both labeled training data and unlabeled training data. One semi-supervised learning technique involves inferring the label of unlabeled training data, and then using this inferred label for learning. This technique may be used advantageously when the cost associated with the labeling process is high.
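The technique of inferring labels for unlabeled data and then learning from them can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid model, the one-dimensional feature values, and the function names are hypothetical and are not taken from the present disclosure.

```python
def centroids(labeled):
    """Mean feature value per label (a very small nearest-centroid model)."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def infer_label(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def pseudo_label_training(labeled, unlabeled):
    """Train on labeled data, infer labels for unlabeled data, then retrain."""
    model = centroids(labeled)
    pseudo = [(x, infer_label(model, x)) for x in unlabeled]
    return centroids(labeled + pseudo), pseudo


if __name__ == "__main__":
    labeled = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.3, "b")]
    unlabeled = [0.8, 1.5, 4.7, 6.0]
    model, pseudo = pseudo_label_training(labeled, unlabeled)
    print(pseudo)
    print(model)
```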

Reinforcement learning may be based on a theory that, given the condition under which a reinforcement learning agent can determine what action to choose at each time instance, the agent can find an optimal path to a solution solely based on experience without reference to data.

Reinforcement learning may be performed mainly through a Markov decision process. A Markov decision process consists of four stages: first, an agent is given a condition containing information required for performing a next action; second, how the agent behaves in the condition is defined; third, which actions the agent should choose to get rewards and which actions to choose to get penalties are defined; and fourth, the agent iterates until future reward is maximized, thereby deriving an optimal policy.
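The four stages above can be sketched on a toy problem. The example below uses value iteration, which assumes that the transitions and rewards are known in advance, so it illustrates the iteration toward an optimal policy rather than learning purely from experience. The five-state corridor, the reward and penalty values, and the discount factor are assumptions made for illustration only.

```python
STATES = range(5)                      # stage 1: the conditions (states)
ACTIONS = {"left": -1, "right": +1}    # stage 2: how the agent may behave
GOAL = 4                               # hypothetical goal state
GAMMA = 0.9                            # assumed discount factor


def step(state, action):
    """Stage 3: reward for reaching the goal, small penalty otherwise."""
    nxt = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if nxt == GOAL else -0.1
    return nxt, reward


def value_iteration(tol=1e-9):
    """Stage 4: iterate until the expected future reward stops improving."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == GOAL:
                continue
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[n] for n, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Choose, in each state, the action with the highest future reward."""
    policy = {}
    for s in STATES:
        if s == GOAL:
            continue

        def score(action, state=s):
            nxt, reward = step(state, action)
            return reward + GAMMA * values[nxt]

        policy[s] = max(ACTIONS, key=score)
    return policy


if __name__ == "__main__":
    v = value_iteration()
    print(greedy_policy(v))
```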

An artificial neural network is characterized by features of its model, the features including an activation function, a loss function or cost function, a learning algorithm, an optimization algorithm, and so forth. Also, the hyperparameters are set before learning, and model parameters can be set through learning to specify the architecture of the artificial neural network.

For instance, the structure of the artificial neural network may be determined by a number of factors, including the number of hidden layers, the number of hidden nodes included in each hidden layer, input feature vectors, target feature vectors, and so forth.

The hyperparameters may include various parameters which need to be initially set for learning, much like the initial values of the model parameters. Also, the model parameters may include various parameters sought to be determined through learning. For instance, the hyperparameters may include initial values of weights and biases between nodes, mini-batch size, iteration number, learning rate, and so forth. Furthermore, the model parameters may include a weight between nodes, a bias between nodes, and so forth.

A loss function may be used as an index (reference) in determining an optimal model parameter during the learning process of an artificial neural network. Learning in the artificial neural network involves a process of adjusting the model parameters so as to reduce the loss function, and the purpose of learning may be to determine the model parameters that minimize the loss function. Loss functions typically use mean squared error (MSE) or cross entropy error (CEE), but the present disclosure is not limited thereto.

The cross-entropy error may be used when a true label is one-hot encoded. One-hot encoding may include an encoding method in which, among given neurons, only those corresponding to a target answer are given 1 as a true label value, while those neurons that do not correspond to the target answer are given 0 as a true label value.
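For illustration, the two loss functions and the one-hot encoding mentioned above may be written as follows. The function names, the small constant `eps` added to avoid taking the logarithm of zero, and the sample output values are assumptions of this sketch.

```python
import math


def one_hot(target_index, num_classes):
    """1 for the neuron matching the target answer, 0 for all others."""
    return [1.0 if i == target_index else 0.0 for i in range(num_classes)]


def mean_squared_error(predicted, target):
    """Average of the squared differences between output and true label."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def cross_entropy_error(predicted, target, eps=1e-12):
    """With a one-hot label, only the target neuron contributes to the sum."""
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, target))


if __name__ == "__main__":
    label = one_hot(2, 3)                 # target answer is class 2
    confident = [0.1, 0.1, 0.8]           # hypothetical network outputs
    mistaken = [0.7, 0.2, 0.1]
    print(cross_entropy_error(confident, label))
    print(cross_entropy_error(mistaken, label))
```

An output that assigns a higher probability to the target answer yields a lower loss under both functions, which is what allows the loss to serve as an index during learning.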

In machine learning or deep learning, learning optimization algorithms may be deployed to minimize a cost function, and examples of such learning optimization algorithms include gradient descent (GD), stochastic gradient descent (SGD), momentum, Nesterov accelerated gradient (NAG), Adagrad, AdaDelta, RMSProp, Adam, and Nadam.

GD includes a method that adjusts model parameters in a direction that decreases the output of a cost function by using a current slope of the cost function.

The direction in which the model parameters are to be adjusted may be referred to as a step direction, and a size by which the model parameters are to be adjusted may be referred to as a step size. Here, the step size may mean a learning rate.

GD obtains a slope of the cost function by partially differentiating the cost function with respect to each of the model parameters, and updates the model parameters by adjusting them by a learning rate in the direction opposite to the slope, that is, the direction in which the cost function decreases.
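A minimal sketch of this update is given below. The quadratic cost function, its minimum at (3, -1), the starting point, and the learning rate are assumptions chosen so that the behavior is easy to follow; each step moves the parameters against the slope so that the cost decreases.

```python
def cost(w):
    """Hypothetical cost function with its minimum at w = (3, -1)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


def gradient(w):
    """Partial derivative of the cost with respect to each parameter."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


def gradient_descent(w, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope, scaled by the learning rate."""
    for _ in range(steps):
        g = gradient(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


if __name__ == "__main__":
    w = gradient_descent([0.0, 0.0])
    print(w, cost(w))
```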

SGD may include a method that separates the training dataset into mini-batches, and by performing gradient descent for each of these mini-batches, increases the frequency of gradient descent.
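The mini-batch scheme can be sketched as follows for a one-variable linear model. The noise-free data set generated from y = 2x + 1, the batch size, the learning rate, and the fixed random seed are assumptions of this example. With 41 samples and a batch size of 8, the parameters are updated six times per pass over the data instead of once.

```python
import random


def make_dataset():
    """Hypothetical noise-free samples of y = 2x + 1 for x in [-2, 2]."""
    return [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-20, 21)]


def minibatch_sgd(data, batch_size=8, epochs=200, learning_rate=0.1, seed=0):
    """Fit y = w*x + b, performing one gradient step per mini-batch."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    updates = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this mini-batch only.
            gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= learning_rate * gw
            b -= learning_rate * gb
            updates += 1
    return w, b, updates


if __name__ == "__main__":
    print(minibatch_sgd(make_dataset()))
```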

Adagrad, AdaDelta, and RMSProp may include methods that increase optimization accuracy in SGD by adjusting the step size. Momentum and NAG may include methods that increase optimization accuracy in SGD by adjusting the step direction. Adam may include a method that combines momentum and RMSProp and increases optimization accuracy in SGD by adjusting the step size and step direction. Nadam may include a method that combines NAG and RMSProp and increases optimization accuracy by adjusting the step size and step direction.
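To make the distinction between adjusting the step direction and adjusting the step size concrete, the sketch below applies a momentum update and an Adam update to the same one-parameter cost function. The cost function (w - 3)^2, the hyperparameter values, and the function names are assumptions for illustration; the update rules follow the commonly published forms of these optimizers.

```python
import math


def grad(w):
    """Slope of the hypothetical cost function (w - 3)**2."""
    return 2.0 * (w - 3.0)


def momentum_descent(w=0.0, learning_rate=0.05, beta=0.9, steps=300):
    """Momentum: the step direction blends the current and past slopes."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity - learning_rate * grad(w)
        w += velocity
    return w


def adam_descent(w=0.0, learning_rate=0.1, beta1=0.9, beta2=0.999,
                 eps=1e-8, steps=500):
    """Adam: a momentum-style direction (m) with an RMSProp-style size (v)."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # running mean of slopes
        v = beta2 * v + (1.0 - beta2) * g * g    # running mean of squares
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


if __name__ == "__main__":
    print(momentum_descent(), adam_descent())
```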

The learning speed and accuracy of the artificial neural network rely not only on the structure and learning optimization algorithms of the artificial neural network but also on the hyperparameters thereof. Therefore, in order to obtain a good trained model, it is important not only to choose a proper structure and learning algorithms for the artificial neural network, but also to choose proper hyperparameters.

In general, the artificial neural network is first trained by experimentally setting hyperparameters to various values, and based on the results of training, the hyperparameters can be set to optimal values that provide a stable learning speed and accuracy.
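The experimental procedure described above can be sketched as a simple search over candidate values. In this example only the learning rate is varied, and the cost function w^2, the number of training steps, and the candidate values are assumptions chosen so that one candidate diverges, one converges slowly, and one converges quickly.

```python
def final_cost(learning_rate, steps=20):
    """Train briefly on the hypothetical cost w**2 and report the result."""
    w = 1.0
    for _ in range(steps):
        w -= learning_rate * 2.0 * w   # 2*w is the slope of w**2
    return w * w


def search_learning_rate(candidates):
    """Try each candidate and keep the one with the lowest final cost."""
    results = {lr: final_cost(lr) for lr in candidates}
    best = min(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = search_learning_rate([1.5, 0.5, 0.1, 0.001])
    print(best, results)
```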

FIG. 1 is a block diagram illustrating a configuration of a machine learning device according to one embodiment of the present disclosure. Referring to FIG. 1, a terminal 100 may be configured as the machine learning device. The terminal 100 may include a wireless communication unit 110, an input unit 120, a learning processor 130, a sensing unit 140, an output unit 150, an interface unit 160, a memory 170, a processor 180, and a power supply unit 190.

The terminal 100 may include various electronic devices capable of performing machine learning. For example, the terminal 100 may be implemented as a stationary terminal or a mobile terminal, such as a mobile phone, a smartphone, a laptop computer, a terminal for digital broadcast, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation system, a slate PC, a tablet PC, an ultrabook, a wearable device (for example, a smartwatch, a smart glass, and a head mounted display (HMD)), a set-top box (STB), a digital multimedia broadcast (DMB) receiver, a radio, a laundry machine, a refrigerator, a vacuum cleaner, an air conditioner, a desktop computer, a projector, and a digital signage. The terminal 100 may be implemented as various forms of home appliances for household use, and may also be implemented as a stationary or mobile robot.

The terminal 100 may perform a function of an audio agent. The audio agent may be a program which recognizes a user's speech and outputs an audio response appropriate to the recognized speech.

A trained model may be provided to the terminal 100. The trained model may be implemented as hardware, software, or a combination of hardware and software, and in cases where the trained model is partially or entirely implemented as software, at least one command constituting the trained model may be stored in the memory 170.

The wireless communication unit 110 may include at least one of a broadcast receiver module 111, a mobile communication module 112, a wireless internet module 113, a short-range communication module 114, or a location information module 115.

The broadcast receiver module 111 receives broadcast signals or broadcast-related information through a broadcast channel from an external broadcast management server.

The mobile communication module 112 may transmit/receive a wireless signal to/from at least one of a base station, an external terminal, or a server over a mobile communication network established according to the technical standards or communication methods for mobile communication (for example, Global System for Mobile communication (GSM), Code Division Multiple Access (CDMA), Frequency Division Multiple Access (FDMA), Code Division Multiple Access 2000 (CDMA2000), Time Division Multiple Access (TDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single Carrier Frequency Division Multiple Access (SC-FDMA), Enhanced Voice-Data Optimized or Enhanced Voice-Data Only (EV-DO), Wideband CDMA (WCDMA), High Speed Downlink Packet Access (HSDPA), High Speed Uplink Packet Access (HSUPA), Long Term Evolution (LTE), Long Term Evolution-Advanced (LTE-A), and 5G).

The wireless internet module 113 is a module for wireless internet access, and may be built in or external to the terminal 100. The wireless internet module 113 may be configured to transmit/receive a wireless signal over a communication network according to wireless internet technologies. The wireless internet technologies may include, for example, Wireless LAN (WLAN), Wireless-Fidelity (Wi-Fi), Wi-Fi Direct, Digital Living Network Alliance (DLNA), Wireless Broadband (WiBro), World Interoperability for Microwave Access (WiMAX), High Speed Downlink Packet Access (HSDPA), High Speed Uplink Packet Access (HSUPA), Long Term Evolution (LTE), Long Term Evolution-Advanced (LTE-A), and 5G.

The short-range communication module 114 is a module for supporting short-range communication. The short-range communication module 114 may support short-range communication by using at least one of Bluetooth™, Radio Frequency Identification (RFID), Infrared Data Association (IrDA), Ultra Wideband (UWB), ZigBee, Near Field Communication (NFC), Wireless-Fidelity (Wi-Fi), Wi-Fi Direct, or Wireless Universal Serial Bus (USB) technologies.

The location information module 115 is a module for obtaining a location of the terminal 100, and its representative example includes a Global Positioning System (GPS) module or a Wireless-Fidelity (Wi-Fi) module. When configured as the GPS module, the location information module 115 may obtain the location of the terminal 100 by using a signal transmitted from a GPS satellite.

The input unit 120 may include at least one camera 121 for obtaining an image signal, a microphone 122 for obtaining an audio signal, and a user input unit 123 for receiving information inputted from a user. Speech data or image data collected by the input unit 120 may be analyzed and processed as a user's control command.

The input unit 120 may obtain training data for model training, and input data to be used to obtain output by using a trained model. The input unit 120 may obtain unprocessed input data and in this case, the processor 180 or the learning processor 130 may preprocess the obtained data, and generate training data or preprocessed input data which may be inputted for model training. The preprocessing of input data may include extracting an input feature from the input data.

The camera 121 processes video frames, such as still images and moving images, which are obtained by an image sensor in a video communication mode or a photographing mode. The processed video frames may be displayed on a display 151 or stored in the memory 170.

The microphone 122 converts an external acoustic signal into electrical speech data. The converted speech data may be utilized in various manners according to a function (or an application program being executed) being performed in the terminal 100. Meanwhile, the microphone 122 may implement various noise removal algorithms for removing noise generated in the process of receiving the external acoustic signal.

The user input unit 123 is for receiving information inputted from a user. When the information is inputted via the user input unit 123, the processor 180 may control the terminal 100 such that the operation of the terminal 100 corresponds to the inputted information.

The user input unit 123 may include a mechanical input means (for example, a button, a dome switch, a jog wheel, and a jog switch which are located at the front/rear or side of the terminal 100) and a touch-type input means. For example, the touch-type input means may be configured as a virtual key, a soft key, or a visual key which are displayed on a touch screen via software processing, or may be configured as a touch key disposed on a portion other than the touch screen.

The learning processor 130 trains a model consisting of an artificial neural network by using training data. More specifically, the learning processor 130 may repeatedly train the artificial neural network by using various training schemes described above to determine optimized model parameters of the artificial neural network.

Throughout the present disclosure, an artificial neural network whose parameters have been determined through training using training data may be referred to as a trained model. The trained model may be used to infer a result value with respect to new input data rather than training data.

The learning processor 130 may be configured to receive, classify, store, and output information to be used for data mining, data analysis, intelligent decision making, and machine learning algorithms and technologies.

The learning processor 130 may include one or more memory units configured to store data received, detected, sensed, generated, predefined, or outputted by another component, device, terminal, or apparatus in communication with the terminal.

The learning processor 130 may include a memory integrated or implemented in the terminal 100. In some embodiments, the learning processor 130 may be implemented by using the memory 170.

Alternatively or additionally, the learning processor 130 may be implemented by using a memory related to the terminal, such as an external memory directly coupled to the terminal 100 or a memory maintained in a server in communication with the terminal 100.

In another embodiment, the learning processor 130 may be implemented by using a memory maintained in a cloud computing environment, or another remote memory location accessible by the terminal 100 via a communications method such as a network.

The learning processor 130 may generally be configured to store data in one or more databases in order to identify, index, categorize, manipulate, store, retrieve, and output data for supervised or unsupervised learning, data mining, predictive analysis, or use in another machine. In such a case, the database may be implemented by using the memory 170, a memory maintained in a cloud computing environment, or another remote memory location accessible by the terminal 100 via the communications method such as the network.

Information stored in the learning processor 130 may be used by one or more other controllers of the terminal 100 or the processor 180 by using one of various different types of data analysis algorithms and machine learning algorithms.

Examples of such algorithms may include a k-nearest neighbor system, fuzzy logic (for example, possibility theory), a neural network, a Boltzmann machine, vector quantization, a pulse neural network, a support vector machine, a maximum margin classifier, hill climbing, an inductive logic system, a Bayesian network, a Petri net (for example, a finite state machine, a Mealy machine, a Moore finite state machine), a classifier tree (for example, a perceptron tree, a support vector tree, a Markov tree, a decision tree forest, a random forest), a reading model and system, artificial fusion, sensor fusion, image fusion, reinforcement learning, augmented reality, pattern recognition, automated planning, and the like.

The processor 180 may determine or predict at least one executable operation of the terminal 100 based on information generated or determined by using data analysis and machine learning algorithms. To this end, the processor 180 may request, retrieve, receive, or utilize the data from the learning processor 130, and may control the terminal 100 such that the terminal 100 performs a predicted operation or desirable operation of the at least one executable operation.

The processor 180 may perform various functions to implement intelligent emulation (that is, a knowledge-based system, an inference system, and a knowledge acquisition system). This may be applied to various types of systems (for example, fuzzy logic systems), including, for example, adaptive systems, machine learning systems, and artificial neural networks.

The processor 180 may also include submodules that enable operations involving speech and natural language processing, such as an I/O processing module, an environment condition module, a Speech to Text (STT) processing module, a natural language processing module, a workflow processing module, and a service processing module.

Each of these submodules may have access to one or more systems or data and models at the terminal, or a subset or superset thereof. In addition, each of these submodules may provide various functions, including a vocabulary index, user data, a workflow model, a service model, and an automatic speech recognition (ASR) system.

In another embodiment, other aspects of the processor 180 or the terminal may be implemented with the submodule, system, or data and model.

In some examples, based on data from the learning processor 130, the processor 180 may be configured to identify and detect a requirement, based on a contextual condition or a user's intent represented by a user input or a natural language input.

The processor 180 may actively derive and obtain information to be used to fully determine the requirement based on the contextual condition or the user's intent. For example, the processor 180 may actively derive information to be used to determine the requirement by analyzing historical data including, for example, historical inputs and outputs, pattern matching, unambiguous words, and input intent.

The processor 180 may determine a task flow for executing a function responding to the requirement based on the contextual condition or the user's intent.

The processor 180 may be configured to collect, sense, extract, detect, or receive, by one or more sensing components of the terminal, signals or data to be used in data analysis and machine learning operations, in order to collect information for processing and storing in the learning processor 130.

Collecting information may include detecting information via a sensor, extracting information stored in the memory 170, or receiving information from another terminal, entity, or external storage device via communication means.

The processor 180 may collect usage history information in the terminal, and store it in the memory 170. The processor 180 may use the stored usage history information and prediction modeling to determine the best match for executing a specific function.

The processor 180 may receive or detect environment information or other information via the sensing unit 140.

The processor 180 may receive a broadcast signal or broadcast-related information, a wireless signal, or wireless data via the wireless communication unit 110.

The processor 180 may receive image information (or a corresponding signal), audio information (or a corresponding signal), data, or user input information from the input unit 120.

The processor 180 may collect information in real time, process or classify the information (for example, knowledge graph, command policy, personalization database, and dialog engine), and store the processed information in the memory 170 or the learning processor 130.

When the operation of the terminal 100 is determined based on a data analysis and a machine learning algorithm and technique, the processor 180 may control components of the terminal 100 such that the components of the terminal 100 perform the determined operation. Subsequently, the processor 180 may execute the determined operation by controlling the terminal 100 according to the control command.

When a specific operation is performed, the processor 180 may analyze history information indicating the execution of the specific operation via the data analysis and the machine learning algorithm and technique, and perform an update of previously trained information based on the analyzed information.

Therefore, the processor 180, along with the learning processor 130, may improve the accuracy of the future performance of the data analysis and the machine learning algorithm and technique based on the updated information.

The sensing unit 140 may include one or more sensors for sensing at least one of information related to the terminal 100 itself, information on an environment surrounding the terminal 100, or user information. For example, the sensing unit 140 may include at least one of a proximity sensor, an illumination sensor, a touch sensor, an acceleration sensor, a magnetic sensor, a gravitational sensor (G-sensor), a gyroscope sensor, a motion sensor, an RGB sensor, an infrared sensor (IR sensor), a finger scan sensor, an ultrasonic sensor, an optical sensor (see, for example, camera 121), a microphone (see microphone 122), a battery gauge, an environment sensor (for example, a barometer, a hygrometer, a thermometer, a radiation detection sensor, a heat detection sensor, and a gas sensing sensor), or a chemical sensor (for example, an electronic nose, a healthcare sensor, and a biometric sensor). Meanwhile, the terminal 100 disclosed herein may combine and utilize information sensed by at least two sensors among these sensors.

The output unit 150 is for generating an output such as a visual output, an audible output, or a haptic output, and may include at least one of a display 151, an acoustic output unit 152, a haptic module 153, or a light output unit 154.

The display 151 is configured to display (output) information processed in the terminal 100. For example, the display 151 may display execution screen information on the application program executed in the terminal 100, or a user interface (UI) and graphic user interface (GUI) information according to the execution screen information.

Since the display 151 may form a mutually layered structure with the touch sensor or may be formed integrally with the touch sensor, the display 151 may implement a touch screen. This touch screen may function as the user input unit 123 to provide an input interface between the terminal 100 and the user, and at the same time may provide an output interface between the terminal 100 and the user.

The acoustic output unit 152 may be configured to output speech data received from the wireless communication unit 110 or stored in the memory 170, for example, in a call signal reception mode, a call mode, a record mode, a speech recognition mode, and a broadcast reception mode. The acoustic output unit 152 may include at least one of a receiver, a speaker, or a buzzer.

The haptic module 153 is configured to generate various haptic effects that the user can feel. A representative example of the haptic effects generated by the haptic module 153 may include vibration.

The light output unit 154 is configured to output a signal for notifying an event occurrence by using light from a light source of the terminal 100. Examples of the event capable of occurring in the terminal 100 may include, for example, message reception, call signal reception, missed call, alarm, schedule notification, email reception, and reception of information via an application.

The interface unit 160 is configured to serve as a path for connection between the terminal 100 and various types of external devices. This interface unit 160 may include at least one of a wired/wireless headset port, an external charger port, a wired/wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video input/output (I/O) port, or an earphone port. In response to an external device being connected to the interface unit 160, the terminal 100 may control the connected external device.

Meanwhile, the identification module is a chip for storing various types of information for authenticating the usage right of the terminal 100, and may include, for example, a user identity module (UIM), a subscriber identity module (SIM), and a universal subscriber identity module (USIM). The device having the identification module (hereinafter referred to as ‘identification device’) may be manufactured in the form of a smart card. Accordingly, the identification device may be connected to the terminal 100 through the interface unit 160.

The memory 170 stores data for supporting various functions of the terminal 100. The memory 170 may store a plurality of application programs (or applications) executed in the terminal 100, data for operating the terminal 100, commands, and data for operating the learning processor 130 (for example, information on at least one algorithm for machine learning).

The memory 170 may store a model trained by the learning processor 130. The memory 170 may classify the trained model into a plurality of versions depending on, for example, a training time or a training progress, where necessary, and may store the classified trained model.
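One way to picture storing a trained model as a plurality of versions is the in-memory sketch below. The `ModelStore` class, its method names, and the record fields are hypothetical and serve only as an illustration; the present disclosure does not specify a storage format.

```python
import copy
import time


class ModelStore:
    """Hypothetical in-memory stand-in for versioned trained-model storage."""

    def __init__(self):
        self._versions = []

    def save(self, parameters, training_step, trained_at=None):
        """Store a snapshot tagged with its training progress and time."""
        record = {
            "version": len(self._versions) + 1,
            "training_step": training_step,
            "trained_at": time.time() if trained_at is None else trained_at,
            # Copy so that later training does not alter the stored version.
            "parameters": copy.deepcopy(parameters),
        }
        self._versions.append(record)
        return record["version"]

    def latest(self):
        """Return the most recently stored version, or None if empty."""
        return self._versions[-1] if self._versions else None

    def by_progress(self, min_step):
        """Return versions trained for at least min_step steps."""
        return [r for r in self._versions if r["training_step"] >= min_step]


if __name__ == "__main__":
    store = ModelStore()
    params = {"w": [0.1]}
    store.save(params, training_step=100)
    params["w"][0] = 0.2
    store.save(params, training_step=200)
    print(store.latest()["version"], len(store.by_progress(150)))
```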

The memory 170 may store, for example, input data obtained by the input unit 120, learning data (or training data) used for model training, and training history of a model. The input data stored in the memory 170 may be data suitably processed for model training, as well as unprocessed input data itself.

The processor 180 typically controls the overall operation of the terminal 100, in addition to the operations associated with the application program. The processor 180 may provide the user with appropriate information or functionality or may process it, by processing, for example, signals, data, and information inputted or outputted via the above-mentioned components or by executing the application program stored in the memory 170.

The processor 180 may control at least some of the components shown in FIG. 1 such that the application program stored in the memory 170 is executed. In addition, the processor 180 may operate at least two of the components included in the terminal 100 in combination to execute the application program.

Meanwhile, as described above, the processor 180 typically controls the overall operation of the terminal 100, in addition to the operations associated with the application program. For example, when the state of the terminal 100 satisfies a predetermined condition, the processor 180 may execute a lock state to prevent the user from inputting a control command for the applications, or may release the lock state.

Under the control of the processor 180, the power supply unit 190 is supplied with external power or internal power, and supplies power to each component included in the terminal 100. This power supply unit 190 may include a battery, which may be an internal battery or a replaceable battery.

FIG. 2 is a block diagram illustrating an environment of a machine learning system according to one embodiment of the present disclosure. Referring to FIG. 2, a machine learning system 200 according to one embodiment of the present disclosure may include a cloud network 210, a server 220, a home appliance 230, a smartphone 240, an XR device 250, an autonomous vehicle 260, and a robot 270.

The cloud network 210 may refer to a network that forms part of a cloud computing infrastructure or exists in a cloud computing infrastructure. This cloud network 210 may include, but is not limited to, wired networks such as local area networks (LANs), wide area networks (WANs), metropolitan area networks (MANs), and integrated service digital networks (ISDNs); or wireless networks such as wireless LANs, CDMA, WCDMA, LTE, LTE-A, 5G, Bluetooth™, and satellite communications.

The cloud network 210 may include connections of network elements, such as hubs, bridges, routers, switches, and gateways. The cloud network 210 may include one or more connected networks such as a multi-network environment, including a public network such as the Internet and a private network such as a secure enterprise private network. Access to the cloud network 210 may be provided over one or more wired or wireless access networks. In addition, the cloud network 210 may support various types of object intelligence communications, such as Internet of things (IoT), Internet of everything (IoE), and Internet of small things (IoST), or a 5G communication, to exchange and process information between distributed components such as objects.

The devices 220, 230, 240, 250, 260, 270 constituting the machine learning system 200 may be connected to each other over the cloud network 210. The devices 220, 230, 240, 250, 260, 270 may communicate with each other via a base station, but may also communicate with each other directly without the base station.

Each of the devices 220, 230, 240, 250, 260, 270 constituting the machine learning system 200 may be configured to include all or a part of the components of the terminal 100 shown in FIG. 1. In addition to the illustrated devices 220, 230, 240, 250, 260, 270, various electronic devices may be included in the machine learning system 200.

FIG. 3 is a block diagram illustrating a configuration of a machine learning system according to one embodiment of the present disclosure. FIG. 4 is a view illustrating a coverage area of a trained model of each learning device of a machine learning system according to one embodiment of the present disclosure.

Referring to FIG. 3, a machine learning system 300 according to one embodiment of the present disclosure may include a first learning device 300 a and a second learning device 300 b.

The first learning device 300 a and the second learning device 300 b may be any one of the terminal 100 shown in FIG. 1 or the devices 220, 230, 240, 250, 260, 270 shown in FIG. 2. For example, the first learning device 300 a and the second learning device 300 b may be implemented as the same or different types of terminals 100, and at least one of the first learning device 300 a or the second learning device 300 b may be implemented as the server 220.

The first learning device 300 a may include a communication unit 310 a, an input unit 320 a, a memory 330 a, a learning processor 340 a, a power supply unit 350 a, and a processor 360 a. The second learning device 300 b may include a communication unit 310 b, an input unit 320 b, a memory 330 b, a learning processor 340 b, a power supply unit 350 b, and a processor 360 b.

The communication units 310 a, 310 b may correspond to the configuration including the wireless communication unit 110 and the interface unit 160 shown in FIG. 1. The input units 320 a, 320 b, the learning processors 340 a, 340 b, the power supply units 350 a, 350 b, and the processors 360 a, 360 b may correspond to the input unit 120, the learning processor 130, the power supply unit 190, and the processor 180 shown in FIG. 1, respectively.

The memories 330 a, 330 b may correspond to the memory 170 shown in FIG. 1. The memories 330 a, 330 b may include model storage units 331 a, 331 b and databases 333 a, 333 b, respectively.

The model storage units 331 a, 331 b store a model (or artificial neuralnetworks 332 a, 332 b) trained or being trained via learning processors340 a, 340 b, and when the model is updated through training, store theupdated model. The model storage units 331 a, 331 b may classify thetrained model into a plurality of versions depending on, for example, atraining time or a training progress, where necessary, and may store theclassified trained model.

The databases 333 a, 333 b store, for example, input data obtained bythe input units 320 a, 320 b, learning data (or training data) used formodel training, and training history of a model. The input data storedin the databases 333 a, 333 b may be data suitably processed for modeltraining, as well as unprocessed input data itself.

The first learning device 300 a may perform machine learning by using the first artificial neural network 332 a. The second learning device 300 b may perform machine learning by using the second artificial neural network 332 b. The first and second artificial neural networks 332 a, 332 b may be implemented as hardware, software, or a combination of hardware and software. When the first and second artificial neural networks 332 a, 332 b are partially or completely implemented as software, one or more commands constituting the first and second artificial neural networks 332 a, 332 b may be stored in the memories 330 a, 330 b. The first and second artificial neural networks 332 a, 332 b illustrated in FIG. 3 are provided as one example of an artificial neural network including a plurality of hidden layers. Accordingly, the artificial neural network according to one embodiment of the present disclosure is not limited thereto.

The first learning device 300 a and the second learning device 300 b according to one embodiment of the present disclosure may cooperate with each other to perform joint machine learning. For convenience of explanation, joint machine learning between two learning devices is described, but the same or a similar approach may be applied to three or more learning devices.

The term “joint machine learning” used herein means that a plurality of learning devices, each capable of machine learning on its own, perform machine learning in a shared manner. It is distinguished from an arrangement in which one device performs machine learning entirely on behalf of another device.

Due to, for example, a difference in processing performance or a difference in battery performance between the first learning device 300 a and the second learning device 300 b, the coverage area of the first trained model based on the first artificial neural network 332 a and the coverage area of the second trained model based on the second artificial neural network 332 b may not match. The coverage area of the trained model may represent a range of input data that may be labeled by the trained model.

FIG. 4 illustrates an overall coverage area 410 of a label for various input data existing in the real world, a second coverage area 420 of the second trained model of the second learning device 300 b, and a first coverage area 430 of the first trained model of the first learning device 300 a. In one embodiment, the first coverage area 430 and the second coverage area 420, which belong to the overall coverage area 410, may have an overlapping area b in common, and may have non-overlapping areas a, c, respectively.

One embodiment of the present disclosure allows the first learning device 300 a and the second learning device 300 b to cover, by joint machine learning, labels that their own trained models are unable to cover. In one embodiment, the first learning device 300 a may cover the input data having a label of area a by joint machine learning with the second learning device 300 b, and the second learning device 300 b may cover the input data having a label of area c by joint machine learning with the first learning device 300 a. Therefore, the label coverage area of each learning device may be expanded. Each learning device may also gradually cover the input data having a label of area d by additional joint machine learning with other learning devices.
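The coverage relationship described for FIG. 4 can be pictured with ordinary sets of labels. The following is only an illustrative sketch: the label names and the helper `coverage_after_joint_learning` are hypothetical, and it assumes the idealized case in which joint learning lets a device reach every label the other device already covers.

```python
# Hypothetical label sets standing in for the two coverage areas of FIG. 4.
first_coverage = {"milk", "apple", "egg"}                       # first trained model
second_coverage = {"apple", "egg", "opened milk", "cut apple"}  # second trained model

overlap = first_coverage & second_coverage       # labels both models cover
first_only = first_coverage - second_coverage    # covered by the first model alone
second_only = second_coverage - first_coverage   # covered by the second model alone


def coverage_after_joint_learning(own, other):
    """Idealized assumption: a device gains every label the other device covers."""
    return own | other


expanded_first = coverage_after_joint_learning(first_coverage, second_coverage)
```

In this toy case the expanded coverage of each device is simply the union of the two sets; labels outside both sets would remain uncovered until further joint learning.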

In some embodiments, the processing performance of the second learning device 300 b may be better than the processing performance of the first learning device 300 a. For example, the first learning device 300 a may be configured as a terminal having a relatively low processing performance, and the second learning device 300 b may be configured as a server having a relatively high processing performance. In such a case, the second learning device 300 b may be implemented, for example, as a plurality of server sets, a cloud server, or a combination thereof.

In one embodiment, the coverage area of the second trained model based on the second artificial neural network 332 b may be wider than the coverage area of the first trained model based on the first artificial neural network 332 a. The coverage area of the second trained model may or may not fully include the coverage area of the first trained model.

In another embodiment, the second artificial neural network 332 b may include more hidden layers than the first artificial neural network 332 a. In other words, the first artificial neural network 332 a may be shallower than the second artificial neural network 332 b.

In yet another embodiment, the first artificial neural network 332 a and the second artificial neural network 332 b may use different learning methods. For example, the first artificial neural network 332 a may perform unsupervised learning for input data or training data that has not been given a label, while the second artificial neural network 332 b may perform supervised learning for input data or training data that has been given a label.

The input unit 320 a of the first learning device 300 a obtains the input data, and provides the processor 360 a with the obtained input data. The processor 360 a or the learning processor 340 a of the first learning device 300 a analyzes at least one feature of the input data by using the first artificial neural network 332 a. When a label for the input data is determined by the first artificial neural network 332 a, the processor 360 a of the first learning device 300 a may control the operation of the first learning device 300 a based on the determined label.

However, the label for the input data may not be determined by the first artificial neural network 332 a. In other words, the first trained model based on the first artificial neural network 332 a may not have a label for the input data. In such a case, the processor 360 a or the learning processor 340 a of the first learning device 300 a may cluster the input data by performing unsupervised learning using the first artificial neural network 332 a. As a result of the clustering, the input data may be classified into any one of a plurality of clusters (or classes).

In one embodiment, the first artificial neural network 332 a may perform class-wise adaptation learning to cluster the input data. Various other algorithms capable of clustering the input data may also be applied to the first artificial neural network 332 a.
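As a purely illustrative stand-in for the clustering step, the sketch below assigns a feature vector to the nearest of several cluster centers. Nearest-centroid assignment is swapped in here for simplicity and is not the class-wise adaptation learning mentioned above; the function name `assign_cluster`, the centroids, and the feature values are all hypothetical.

```python
import math


def assign_cluster(feature, centroids):
    """Return the index of the centroid closest to `feature` (Euclidean distance)."""
    distances = [math.dist(feature, c) for c in centroids]
    return distances.index(min(distances))


# Hypothetical 2-D cluster centers and one unlabeled feature vector.
centroids = [(0.0, 0.0), (5.0, 5.0), (0.0, 8.0)]
cluster_id = assign_cluster((4.2, 5.5), centroids)
```

Here the feature vector lies closest to the second center, so it is placed in the cluster with index 1.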

The processor 360 a or the learning processor 340 a of the first learning device 300 a may extract a plurality of sample features from a plurality of features associated with the determined cluster, and transmit the extracted sample features to the second learning device 300 b via the communication unit 310 a. In other words, the first learning device 300 a may cluster the input data having the unidentified label into a specific cluster, and then extract some sample features from among the features of the input data and the features previously classified as the specific cluster. Extracting and transmitting the sample features representative of the cluster may reduce overhead compared to transmitting the input data as is or transmitting all the features associated with the cluster or class. In addition, since only the features are transmitted, no personal information of the user of the first learning device 300 a is provided to the second learning device 300 b.

The sample features may be extracted in a variety of ways from the plurality of features associated with the cluster. In one embodiment, the sample features may be extracted to include features located at the center of the cluster. These sample features may represent the most basic features of the cluster. In another embodiment, the sample features may be extracted such that a variance value of the extracted sample features exceeds a predetermined threshold value. That is, the sample features may be extracted to include not only features located at the center of the cluster, but also features located further outside of the cluster. Because these sample features may represent features of the entire cluster, a label determined based on the sample features may be associated with features of the entire cluster. In such a case, the threshold value may be selected as various values depending on, for example, the number of features associated with the cluster, the number of sample features to be extracted, and the design intent. In yet another embodiment, the sample features may be extracted at random, or be extracted to form a predetermined distribution.
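Two of the extraction strategies above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: features are plain tuples of floats, the "variance value" is taken to be the sum of per-dimension population variances, and the variance-constrained variant simply retries random subsets. All function names and data values are hypothetical.

```python
import math
import random
import statistics


def centroid(features):
    """Mean vector of a list of equal-length feature tuples."""
    return tuple(statistics.fmean(dim) for dim in zip(*features))


def sample_near_center(features, k):
    """Return the k features closest to the cluster center."""
    center = centroid(features)
    return sorted(features, key=lambda f: math.dist(f, center))[:k]


def total_variance(features):
    """Sum of per-dimension population variances (one possible 'variance value')."""
    return sum(statistics.pvariance(dim) for dim in zip(*features))


def sample_with_spread(features, k, threshold, rng, max_tries=100):
    """Draw random k-subsets until one exceeds the threshold; None if none does."""
    for _ in range(max_tries):
        candidate = rng.sample(features, k)
        if total_variance(candidate) > threshold:
            return candidate
    return None


# Hypothetical 2-D features of one cluster.
cluster_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
                    (0.5, 0.5), (10.0, 10.0), (-10.0, -10.0)]
central = sample_near_center(cluster_features, 2)
spread = sample_with_spread(cluster_features, 3, threshold=1.0,
                            rng=random.Random(0))
```

The center-based sample keeps only the most typical features, while the variance-constrained sample is accepted only when it also reaches toward the edge of the cluster.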

The communication unit 310 b of the second learning device 300 b provides the processor 360 b with a plurality of sample features received from the first learning device 300 a. The processor 360 b or the learning processor 340 b of the second learning device 300 b determines the label for the plurality of sample features by analyzing the plurality of sample features using the second artificial neural network 332 b. The label for the plurality of sample features may be determined based on an output value from the second artificial neural network 332 b that has received the plurality of sample features. The processor 360 b of the second learning device 300 b transmits the determined label to the first learning device 300 a via the communication unit 310 b.

The communication unit 310 a of the first learning device 300 a provides the processor 360 a with the label received from the second learning device 300 b. The processor 360 a or the learning processor 340 a of the first learning device 300 a may associate the received label with the previously determined cluster. The processor 360 a of the first learning device 300 a may control the operation of the first learning device 300 a based on the label associated with the cluster.

In one embodiment, the label received from the second learning device 300 b may be used to train the first artificial neural network 332 a in the first learning device 300 a. For example, at least one feature of the input data, features associated with the cluster, and the received label may be used as training data for training the first artificial neural network 332 a.

In another embodiment, the label received from the second learning device 300 b may be used only to control the operation of the first learning device 300 a, and may not be used to train the first artificial neural network 332 a. This option may be preferred when the processing performance of the first learning device 300 a is low or its storage space is insufficient. In addition, since the communication between the first learning device 300 a and the second learning device 300 b in the 5G communication environment may take place in real time with little latency, it may be preferred that the first learning device 300 a receives a label from the second learning device 300 b when needed.

Meanwhile, not only the input data but also the label for the input data may be obtained via the input unit 320 a of the first learning device 300 a. In one embodiment, when the label for the input data is not determined by the first artificial neural network 332 a, the first learning device 300 a may obtain the label for the input data by inducing the user to input the label for the input data. In such a case, the processor 360 a of the first learning device 300 a may control the operation of the first learning device 300 a based on the inputted label.

The label for the input data provided by the user may be a label that, before being provided by the user, was covered by neither the first trained model of the first learning device 300 a nor the second trained model of the second learning device 300 b.

Accordingly, the first learning device 300 a that has obtained the label for the input data may transmit at least one feature of the input data and the label for the input data to the second learning device 300 b via the communication unit 310 a.

The communication unit 310 b of the second learning device 300 b provides the processor 360 b with at least one feature of the input data and the label for the input data, which are received from the first learning device 300 a. The processor 360 b or the learning processor 340 b of the second learning device 300 b may train the second artificial neural network 332 b by using the at least one feature of the input data and the label of the input data as training data.

According to one embodiment of the present disclosure, the first learning device 300 a and the second learning device 300 b may cooperate with each other to extend the coverage area of the trained model. In particular, the artificial intelligence or machine learning performance of the first learning device 300 a and the second learning device 300 b can be improved in the Internet of things environment implemented as the 5G communication network.

FIG. 5 is a view illustrating joint machine learning of a machine learning system according to one embodiment of the present disclosure. Referring to FIG. 5, the joint machine learning may be performed by using a first artificial neural network 510 and a second artificial neural network 520. In FIG. 5, Xt represents features of an input layer of the first artificial neural network 510, and Ht(k) represents features of the k-th hidden layer of the first artificial neural network 510. Ht,j(k) represents features of the k-th hidden layer for the j-th sample. Xs represents features of an input layer of the second artificial neural network 520, Hs(k) represents features of the k-th hidden layer of the second artificial neural network 520, and Ys represents a label of the second artificial neural network 520.

The first artificial neural network 510 may include m hidden layers, and the second artificial neural network 520 may include n hidden layers. When the first artificial neural network 510 is used by a learning device having a relatively low processing performance and the second artificial neural network 520 is used by a learning device having a relatively high processing performance, n may be greater than m. That is, the first artificial neural network 510 may be shallower than the second artificial neural network 520. However, due to other factors, the relationship between n and m may be determined regardless of the processing performance of the learning device.

The first artificial neural network 510 may perform unsupervised learning to determine, from among a plurality of clusters, a cluster (or class) 515 to which input data belongs. A plurality of sample features may be extracted from a plurality of features belonging to the determined cluster (or class) 515, and the extracted sample features may be inputted to the second artificial neural network 520. The second artificial neural network 520 may determine a label 525 for the sample features, and the label 525 for the sample features may be associated with the cluster or class 515 determined earlier by the first artificial neural network 510.

The first artificial neural network 510 and the second artificial neural network 520 may be provided to different learning devices, respectively. However, in some instances, the first artificial neural network 510 and the second artificial neural network 520 may be provided to different modules that perform different functions within the same device.

FIG. 6 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure. The machine learning method shown in FIG. 6 may be performed by a first learning device using a first artificial neural network. In one embodiment, the first learning device may be a terminal, and a second learning device may be a server.

The first learning device obtains input data (S610). The first learning device analyzes at least one feature of the input data by using the first artificial neural network (S620). The first learning device determines whether a label for the input data is determined by the first artificial neural network (S630). When the label for the input data is determined by the first artificial neural network, the method according to one embodiment of the present disclosure ends. Subsequently, the first learning device may control the operation of the first learning device based on the determined label.

When the label for the input data is not determined by the first artificial neural network, the first learning device determines, from among a plurality of clusters, a cluster to which the input data belongs, by using the first artificial neural network (S640). The first learning device extracts a plurality of sample features from a plurality of features associated with the determined cluster (S650), and transmits the extracted sample features to the second learning device using a second artificial neural network (S660).

The first learning device receives the label for the plurality of sample features from the second learning device, in response to the transmission of the plurality of sample features (S670). The received label may be determined by analyzing the plurality of sample features by using the second artificial neural network in the second learning device. The first learning device associates the received label with the determined cluster (S680). Subsequently, the first learning device may control the operation of the first learning device based on the received label.
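The control flow of FIG. 6 can be summarized in a short sketch. Everything here is hypothetical scaffolding: the callables passed in stand for the first artificial neural network, the sample extraction, and the round trip to the second learning device, and `None` is assumed to signal that no label was determined.

```python
def run_first_device(features, local_predict, assign_cluster,
                     extract_samples, request_label, cluster_labels):
    """Sketch of steps S620 to S680 for already obtained input features."""
    label = local_predict(features)        # S620, S630: try the first network
    if label is not None:
        return label
    cluster_id = assign_cluster(features)  # S640: determine the cluster
    samples = extract_samples(cluster_id)  # S650: extract sample features
    label = request_label(samples)         # S660, S670: send samples, receive label
    cluster_labels[cluster_id] = label     # S680: associate label with the cluster
    return label


# Toy stubs standing in for the networks and the communication unit.
cluster_labels = {}
label = run_first_device(
    features=(0.9, 0.1),
    local_predict=lambda f: None,
    assign_cluster=lambda f: 2,
    extract_samples=lambda cid: [(0.9, 0.1), (0.8, 0.2)],
    request_label=lambda samples: "cut apple",
    cluster_labels=cluster_labels,
)
```

Passing the collaborators in as callables keeps the sketch independent of any particular network or transport, which the disclosure leaves open.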

FIG. 7 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure. The machine learning method shown in FIG. 7 may be performed by a first learning device using a first artificial neural network. In one embodiment, the first learning device may be a terminal, and a second learning device may be a server.

The first learning device obtains input data (S710). The first learning device analyzes at least one feature of the input data by using the first artificial neural network (S720). The first learning device determines whether a label for the input data is determined by the first artificial neural network (S730). When the label for the input data is determined by the first artificial neural network, the method according to one embodiment of the present disclosure ends. The first learning device may control the operation of the first learning device based on the determined label.

When the label for the input data is not determined by the first artificial neural network, the first learning device receives the label for the input data (S740). In one embodiment, the first learning device may obtain the label for the input data by inducing the user to input the label for the input data.

The first learning device transmits at least one feature of the input data and the label for the input data to the second learning device using a second artificial neural network (S750). The at least one feature of the input data and the label for the input data, which are transmitted to the second learning device, may be used as training data for training the second artificial neural network in the second learning device.

FIG. 8 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure. The machine learning method shown in FIG. 8 may be performed by a second learning device using a second artificial neural network. In one embodiment, a first learning device may be a terminal, and the second learning device may be a server.

The second learning device receives, from the first learning device using a first artificial neural network, a plurality of sample features associated with a cluster to which input data belongs (S810). The second learning device determines a label for the plurality of sample features by using the second artificial neural network (S820), and transmits the determined label to the first learning device (S830). The transmitted label may be associated with the cluster by the first learning device.
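One way the second learning device might reduce several sample features to a single label (S820) is sketched below. The majority vote is an assumption made for illustration; the disclosure only states that the label is determined using the second artificial neural network. The `classify` callable is a hypothetical stand-in for that network.

```python
from collections import Counter


def determine_label(sample_features, classify):
    """Classify each sample and return the most frequent label (assumed rule)."""
    votes = Counter(classify(f) for f in sample_features)
    label, _count = votes.most_common(1)[0]
    return label


def toy_classifier(feature):
    """Hypothetical stand-in for the second artificial neural network."""
    return "apple" if feature[0] > 0.5 else "milk"


received = [(0.9,), (0.8,), (0.1,)]   # sample features received in S810
label_to_send = determine_label(received, toy_classifier)   # sent back in S830
```

Two of the three samples are classified as "apple" in this toy input, so that label would be returned to the first learning device.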

FIG. 9 is a flowchart illustrating a machine learning method according to one embodiment of the present disclosure. The machine learning method shown in FIG. 9 may be performed by a second learning device using a second artificial neural network. In one embodiment, a first learning device may be a terminal, and the second learning device may be a server.

The second learning device receives, from the first learning device using a first artificial neural network, at least one feature of input data and a label for the input data (S910). The label for the input data may be obtained by the first learning device from a user. The second learning device trains the second artificial neural network by using the at least one feature of the input data and the label for the input data as training data (S920).
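The effect of step S920, in which a previously unknown label becomes available after one training pair arrives, can be illustrated with a deliberately simple stand-in model. The class below is not a neural network: it keeps one running-mean prototype per label and predicts by nearest prototype. The class name, methods, and data are hypothetical.

```python
import math


class PrototypeModel:
    """Toy stand-in for the second network: one running-mean prototype per label."""

    def __init__(self):
        self.prototypes = {}  # label -> (mean feature tuple, sample count)

    def train(self, feature, label):
        """Fold one (feature, label) training pair into the label's prototype."""
        mean, count = self.prototypes.get(label, ((0.0,) * len(feature), 0))
        count += 1
        mean = tuple(m + (x - m) / count for m, x in zip(mean, feature))
        self.prototypes[label] = (mean, count)

    def predict(self, feature):
        """Return the label of the nearest prototype, or None if untrained."""
        if not self.prototypes:
            return None
        return min(self.prototypes,
                   key=lambda lb: math.dist(feature, self.prototypes[lb][0]))


model = PrototypeModel()
before = model.predict((2.0, 2.0))      # no label is known yet
model.train((1.0, 1.0), "cut apple")    # S910, S920: pair received from the terminal
model.train((3.0, 3.0), "cut apple")
after = model.predict((2.1, 2.0))
```

A real second artificial neural network would be updated by gradient-based training rather than a running mean; the sketch only shows how the received pair extends what the model can label.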

Meanwhile, the embodiments described above can be applied to joint machine learning between a refrigerator and a server, and thus be utilized to manage articles stored in the refrigerator.

The real-world articles that may be stored in the refrigerator are diverse, and even the same type of articles are diverse in form and condition (for example, milk, opened milk, an apple, a half-eaten apple, and a cut apple). Even if a processor capable of machine learning is provided to an individual refrigerator, the individual refrigerator has limitations in identifying the types and names of various articles.

According to one embodiment of the present disclosure, since the refrigerator may determine a label for a new article with the help of the server, the artificial intelligence or machine learning performance of the refrigerator may be improved. In addition, according to one embodiment of the present disclosure, the refrigerator may cooperate with the server to perform joint machine learning with little overhead and without the burden of exposing personal information of a user using the article.

The refrigerator is a home appliance capable of storing foodstuffs at low temperatures in a storage chamber that is shielded by a plurality of doors. The refrigerator may cool storage spaces by using cold air generated from heat exchange with a refrigerant circulating in a refrigeration cycle, thereby keeping the foodstuffs stored therein in a refrigerated or frozen state.

The refrigerator may be implemented to include some components of the terminal 100 shown in FIG. 1 or some components of the first learning device 300 a shown in FIG. 3. The refrigerator 100 according to one embodiment of the present disclosure may include a wireless communication unit 110, an input unit 120, a learning processor 130, a sensing unit 140, an output unit 150, an interface unit 160, a memory 170, a processor 180, and a power supply unit 190. Explanations overlapping with FIGS. 1 and 3 will be omitted, and differences therebetween will be mainly described.

The refrigerator includes a storage chamber having a plurality of storage spaces, and the plurality of storage spaces may be variously classified depending on, for example, types, forms, and storage temperatures of the stored articles.

The input unit 120 may include a camera 121, a microphone 122, and a user input unit 123.

The camera 121 may be disposed at the front of the refrigerator to obtain image data of the user holding the article. In another embodiment, the camera 121 may be disposed at one side of a door to obtain image data of the user holding the article with the door being opened or image data of the article to be loaded into the storage chamber. In addition, the camera 121 may be disposed in each of the plurality of storage spaces to obtain image data of the articles stored in each storage space and image data of an empty space in each storage space.

The microphone 122 may obtain a user's utterance speech (speech command). To more accurately obtain the user's utterance speech, a plurality of microphones 122 may be disposed. The plurality of microphones 122 may be disposed at different locations, spaced apart from each other. In one embodiment, the microphone 122 may receive the user's utterance speech that indicates a label for the article to be stored in the refrigerator.

The user input unit 123 may receive information from the user. In one embodiment, the user input unit 123 may receive information that indicates the label for the article to be stored in the refrigerator.

The output unit 150 may include a display 151, an acoustic output unit 152, and a light output unit 154.

The display 151 may display, for example, information corresponding to the user's speech data (speech command), information on a processing result corresponding to the user's utterance speech, information on the storage state of the article in the refrigerator, and information on the recommended storage space for storing the article. The display 151 may be implemented as a touch screen to function as the user input unit 123 and to provide an output interface.

The acoustic output unit 152 may output by audio, for example, various notification messages, information corresponding to the user's utterance speech, and a processing result. In one embodiment, the acoustic output unit 152 may output utterance-inducing audio for inducing the user to input a label for the article to be stored in the refrigerator.

The light output unit 154 may include a plurality of illumination devices for illuminating the plurality of storage spaces in the storage chamber. At least one illumination device may be disposed in each storage space.

The processor 180 receives, from the input unit 120, input data relating to the article to be stored in the storage chamber. The processor 180 or the learning processor 130 analyzes at least one feature of the input data by using the first artificial neural network.

When the label for the article is not determined by the first artificial neural network, the processor 180 may attempt to obtain the label for the article from the user via the input unit 120. When the label for the article is nevertheless not inputted, the processor 180 or the learning processor 130 may cluster the input data by performing unsupervised learning using the first artificial neural network. As a result of the clustering, the corresponding article may be classified into any one of a plurality of clusters (or classes).

The processor 180 or the learning processor 130 may extract a plurality of sample features from a plurality of features associated with the determined cluster, and transmit the extracted sample features to the server via the wireless communication unit 110. Extracting and transmitting the sample features may reduce overhead compared to transmitting the input data as is or transmitting all the features associated with the cluster. In addition, since only the features are transmitted, no personal information of the user of the refrigerator is provided to the server.

The processor 180 receives the label for the sample features from the server, in response to the transmission of the sample features. This label may be determined by analyzing the sample features by using the second artificial neural network in the server.

The processor 180 determines, from among the plurality of storage spaces, a recommended storage space for storing the corresponding article, based on the label determined by the first artificial neural network, the label inputted by the user, or the label received from the server. The processor 180 may determine the recommended storage space for storing the corresponding article based on, for example, the label of the corresponding article, the information on the empty space among the plurality of storage spaces, and the storage temperature of the corresponding article.
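How the three inputs named above (label, empty space, and storage temperature) might be combined is sketched below. The selection rule, the `StorageSpace` fields, the space names, and the temperature table are all assumptions made for illustration; the disclosure does not prescribe a particular rule.

```python
from dataclasses import dataclass


@dataclass
class StorageSpace:
    name: str
    temperature_c: float   # temperature maintained in this space
    free_slots: int        # remaining empty space, e.g. estimated from camera images


def recommend_space(label, spaces, preferred_temp_c):
    """Pick the non-full space whose temperature is closest to the article's target."""
    target = preferred_temp_c.get(label)
    candidates = [s for s in spaces if s.free_slots > 0]
    if target is None or not candidates:
        return None
    return min(candidates, key=lambda s: abs(s.temperature_c - target))


# Hypothetical refrigerator layout and per-label storage temperatures.
spaces = [
    StorageSpace("door shelf", 5.0, 2),
    StorageSpace("chiller", 1.0, 0),      # best temperature for milk, but full
    StorageSpace("freezer", -18.0, 3),
]
preferred_temp_c = {"milk": 2.0, "ice cream": -18.0}
choice = recommend_space("milk", spaces, preferred_temp_c)
```

In this example the chiller is excluded because it has no empty space, so the door shelf is recommended for milk.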

The processor 180 may inform the user of the determined recommended storage space via the output unit 150. The processor 180 may inform the user of the recommended storage space for storing the article, via at least one of the display 151 or the acoustic output unit 152. In another embodiment, the processor 180 may inform the user of the recommended storage space for storing the article, by turning on at least one illumination device corresponding to the recommended storage space among the plurality of illumination devices of the light output unit 154.

FIG. 10 is a flowchart illustrating a refrigerator control method according to one embodiment of the present disclosure. The method shown in FIG. 10 may be performed by a refrigerator communicatively connected to a server.

The input unit 120 of the refrigerator obtains input data relating to an article to be stored in a storage chamber (S1010). The input data may include image data of the article.

The processor 180 or the learning processor 130 of the refrigerator analyzes at least one feature of the input data relating to the article by using a first artificial neural network (S1020). The processor 180 or the learning processor 130 determines whether a label for the article is determined (S1030). When the label for the article is determined by the first artificial neural network, the processor 180 may determine a recommended storage space for storing the article based on the determined label (S1090), and inform the user of the recommended storage space (S1095).

When the label for the article is not determined by the first artificial neural network, the processor 180 checks whether the label for the article is inputted via the input unit 120 (S1040). When the label for the article is inputted, the processor 180 transmits the at least one feature of the input data and the label for the article to the server using the second artificial neural network via the wireless communication unit 110 (S1045). The at least one feature of the input data and the label for the article, which are transmitted to the server, may be used as training data for training the second artificial neural network in the server. Subsequently, the processor 180 may determine the recommended storage space for storing the article based on the inputted label (S1090), and inform the user of the recommended storage space (S1095).

When the label for the article is not inputted, the processor 180 or the learning processor 130 determines a cluster to which the article belongs, by analyzing the at least one feature of the input data by using the first artificial neural network (S1050). The processor 180 or the learning processor 130 extracts a plurality of sample features from a plurality of features associated with the determined cluster (S1060), and transmits the extracted sample features to the server using the second artificial neural network via the wireless communication unit 110 (S1070).

The processor 180 or the learning processor 130 receives the label for the plurality of sample features from the server, in response to the transmission of the plurality of sample features (S1080). The received label may be determined by analyzing the plurality of sample features by using the second artificial neural network in the server. The processor 180 or the learning processor 130 may determine the recommended storage space for storing the article based on the received label (S1090), and inform the user of the recommended storage space (S1095).
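The three-way fallback of FIG. 10 (local label, then user label, then server label) can be condensed into the following sketch. The callables are hypothetical placeholders for the first artificial neural network, the user prompt, and the two kinds of exchange with the server, and `None` is assumed to mean that no label was obtained at that stage.

```python
def resolve_label(features, local_predict, ask_user,
                  send_training_pair, request_label_for_cluster):
    """Return (label, source) following the order of steps S1020 to S1080."""
    label = local_predict(features)              # S1020, S1030
    if label is not None:
        return label, "local"
    label = ask_user()                           # S1040
    if label is not None:
        send_training_pair(features, label)      # S1045: server may train on the pair
        return label, "user"
    # S1050 to S1080: cluster, extract samples, and obtain the label from the server.
    return request_label_for_cluster(features), "server"


# Toy run: the local model fails, and the user supplies the label.
sent_pairs = []
label, source = resolve_label(
    (0.3, 0.7),
    local_predict=lambda f: None,
    ask_user=lambda: "kimchi",
    send_training_pair=lambda f, lb: sent_pairs.append((f, lb)),
    request_label_for_cluster=lambda f: "unused",
)
```

Whichever branch produces the label, the caller can then proceed to determine and announce the recommended storage space (S1090, S1095).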

The example embodiments described above may be implemented through computer programs executable through various components on a computer, and such computer programs may be recorded on computer-readable media. Examples of the computer-readable media include, but are not limited to: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks and DVD-ROM disks; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and execute program codes, such as ROM, RAM, and flash memory devices.

The computer programs may be those specially designed and constructed for the purposes of the present disclosure or they may be of the kind well known and available to those skilled in the computer software arts. Examples of program code include both machine code, such as code produced by a compiler, and higher level code that may be executed by the computer using an interpreter.

As used in the present disclosure (especially in the appended claims),the terms “a/an” and “the” include both singular and plural references,unless the context clearly states otherwise. Also, it should beunderstood that any numerical range recited herein is intended toinclude all sub-ranges subsumed therein (unless expressly indicatedotherwise) and therefore, the disclosed numeral ranges include everyindividual value between the minimum and maximum values of the numeralranges.

Also, the order of individual steps in process claims of the present disclosure does not imply that the steps must be performed in this order; rather, the steps may be performed in any suitable order, unless expressly indicated otherwise. In other words, the present disclosure is not necessarily limited to the order in which the individual steps are recited. Therefore, it should be understood that the scope of the present disclosure is not limited to the example embodiments described above or by the use of such terms unless limited by the appended claims. Also, it should be apparent to those skilled in the art that various alterations, substitutions, and modifications may be made within the scope of the appended claims or equivalents thereof.

The present disclosure is thus not limited to the example embodiments described above, but is rather intended to include the following appended claims, and all modifications, equivalents, and alternatives falling within the spirit and scope of the following claims.

What is claimed is:
 1. A machine learning method, comprising: obtaining a first input data; determining, from a plurality of clusters, a cluster to which the first input data corresponds, the determination being made by processing the first input data in a first artificial neural network of a first device; transmitting a plurality of sample features associated with the determined cluster to a second artificial neural network of a second device; generating a first label corresponding to the transmitted plurality of sample features by analyzing the transmitted plurality of sample features in the second artificial neural network of the second device; transmitting the first label from the second device to the first device; and assigning the transmitted first label to the determined cluster by the first device.
 2. The machine learning method of claim 1, further comprising: extracting the plurality of sample features from a plurality of features representing at least an entirety of the cluster, and wherein a variance value of the extracted plurality of sample features exceeds a predetermined threshold value.
 3. The machine learning method of claim 1, further comprising: obtaining a second input data; receiving a second label corresponding to the second input data; and transmitting at least one feature of the second input data and the second label from the first device to the second artificial neural network of the second device, training the second artificial neural network of the second device by using at least one feature of the second input data and the second label as input and output of the second artificial neural network respectively.
 4. The machine learning method of claim 1, wherein the first device comprises a terminal, and the second device comprises a server.
 5. The machine learning method of claim 1, wherein an amount of hidden layers of the second artificial neural network of the second device is larger than an amount of hidden layers of the first artificial neural network of the first device.
 6. The machine learning method of claim 1, wherein the first device and the second device communicate with each other over a 5G communication network.
 7. The machine learning method of claim 1, wherein the first artificial neural network of the first device performs unsupervised learning, and the second artificial neural network of the second device performs supervised learning.
 8. A device configured to perform machine learning, comprising: an input unit configured to obtain a first input data; a communication unit configured to communicate with external devices; and at least one processor configured to: determine, from a plurality of clusters, a cluster to which the first input data corresponds, the determination being made by processing the first input data in an artificial neural network of the device; transmit a plurality of sample features associated with the determined cluster to external devices through the communication unit; receive a first label for the plurality of sample features through the communication unit from external devices, in response to the transmission; and assign the received first label to the determined cluster by the device.
 9. The device of claim 8, wherein the received first label is generated based on analyzing the plurality of sample features in at least one external device.
 10. The device of claim 8, wherein the at least one processor is further configured to extract the plurality of sample features from a plurality of features representing at least an entirety of the cluster, and wherein a variance value of the extracted plurality of sample features exceeds a predetermined threshold value.
 11. The device of claim 8, wherein the input unit is further configured to obtain a second input data and a second label corresponding to the second input data, and wherein the at least one processor is further configured to transmit at least one feature of the second input data and the second label to at least one external device.
 12. The device of claim 8, further comprising a terminal.
 13. The device of claim 8, wherein an amount of hidden layers of an artificial neural network in at least one external device is larger than an amount of hidden layers of the artificial neural network of the device.
 14. The device of claim 8, wherein the device is configured to communicate with external devices over a 5G communication network.
 15. A machine learning system, comprising: a first device configured to: obtain a first input data; determine, from a plurality of clusters, a cluster to which the first input data corresponds, the determination being made by processing the first input data in a first artificial neural network of a first device; transmit a plurality of sample features associated with the determined cluster to a second artificial neural network of a second device; and assign a first label to the determined cluster by the first device; and the second device connected to the first device and configured to: generate the first label corresponding to the transmitted plurality of sample features by analyzing the transmitted plurality of sample features in the second artificial neural network of the second device; and transmit the first label from the second device to the first device.
 16. The machine learning system of claim 15, wherein the first device is further configured to obtain a second input data and a second label corresponding to the second input data, and transmit at least one feature of the second input data and the second label from the first device to the second artificial neural network of the second device, and wherein the second device is further configured to train the second artificial neural network by using at least one feature of the second input data and the second label as input and output of the second artificial neural network respectively.
 17. The machine learning system of claim 15, wherein the first device comprises a terminal, and the second device comprises a server.
 18. The machine learning system of claim 15, wherein an amount of hidden layers of the second artificial neural network of the second device is larger than an amount of hidden layers of the first artificial neural network of the first device.
 19. The machine learning system of claim 15, wherein the first device and the second device communicate with each other over a 5G communication network.
 20. The machine learning system of claim 15, wherein the first artificial neural network of the first device performs unsupervised learning, and the second artificial neural network of the second device performs supervised learning.